Copula-based synthetic data augmentation for machine-learning emulators

نویسندگان

چکیده

Abstract. Can we improve machine-learning (ML) emulators with synthetic data? If data are scarce or expensive to source and a physical model is available, statistically generated may be useful for augmenting training sets cheaply. Here explore the use of copula-based models generating synthetically augmented datasets in weather climate by testing method on toy downwelling longwave radiation corresponding neural network emulator. Results show that copula-augmented datasets, predictions improved up 62 % mean absolute error (from 1.17 0.44 W m−2).

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Machine Learning Models for Housing Prices Forecasting using Registration Data

This article has been compiled to identify the best model of housing price forecasting using machine learning methods with maximum accuracy and minimum error. Five important machine learning algorithms are used to predict housing prices, including Nearest Neighbor Regression Algorithm (KNNR), Support Vector Regression Algorithm (SVR), Random Forest Regression Algorithm (RFR), Extreme Gradient B...

متن کامل

Attacks on Virtual Machine Emulators

As virtual machine emulators have become commonplace in the analysis of malicious code, malicious code has started to fight back. This paper describes known attacks against the most widely used virtual machine emulators (VMware and VirtualPC). This paper also demonstrates newly discovered attacks on other virtual machine emulators (Bochs, Hydra, QEMU, and Xen), and describes how to defend again...

متن کامل

Copula-Based Approach to Synthetic Population Generation

Generating synthetic baseline populations is a fundamental step of agent-based modeling and simulation, which is growing fast in a wide range of socio-economic areas including transportation planning research. Traditionally, in many commercial and non-commercial microsimulation systems, the iterative proportional fitting (IPF) procedure has been used for creating the joint distribution of indiv...

متن کامل

Identification Psychological Disorders Based on Data in Virtual Environments Using Machine Learning

Introduction: Psychological disorders is one of the most problematic and important issue in today's society. Early prognosis of these disorders matters because receiving professional help at the appropriate time could improve the quality of life of these patients. Recently, researches use social media as a form of new tools in identifying psychological disorder. It seems that through the use of...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Geoscientific Model Development

سال: 2021

ISSN: ['1991-9603', '1991-959X']

DOI: https://doi.org/10.5194/gmd-14-5205-2021